A data mining approach to discover genetic and environmental factors involved in multifactorial diseases

Authors

  • Laetitia Vermeulen-Jourdan
  • Clarisse Dhaenens
  • El-Ghazali Talbi
  • Sophie Gallina
Abstract

In this paper, we are interested in discovering genetic and environmental factors that are involved in multifactorial diseases. Experiments have been carried out by the Biological Institute of Lille, and a large amount of data has been generated. To exploit these data, data mining tools are required, and we propose a two-phase optimisation approach using a specific genetic algorithm. During the first phase, we select significant features with this genetic algorithm. Then, during the second phase, we cluster affected individuals according to the features selected in the first phase. The paper describes the specificities of the genetic problem that we are studying, and presents in detail the genetic algorithm that we have developed to deal with this very large feature selection problem. Results on both artificial and real data are presented. © 2001 Elsevier Science Ltd. All rights reserved.
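
The two-phase idea described in the abstract can be sketched in a few lines of Python. This is a minimal illustration only, not the authors' algorithm: the fitness function, the penalty of 0.3 per feature, the crossover and mutation operators, the use of k-means for the clustering phase, and the toy data are all assumptions made for the example.

```python
import random

def fitness(mask, rows, labels, penalty=0.3):
    """Hypothetical score: summed gap between class means on the selected
    features, minus a fixed penalty per selected feature."""
    chosen = [j for j, bit in enumerate(mask) if bit]
    if not chosen:
        return float("-inf")  # an empty selection is never acceptable
    sick = [r for r, y in zip(rows, labels) if y == 1]
    well = [r for r, y in zip(rows, labels) if y == 0]
    gap = sum(abs(sum(r[j] for r in sick) / len(sick)
                  - sum(r[j] for r in well) / len(well)) for j in chosen)
    return gap - penalty * len(chosen)

def select_features(rows, labels, n_features, pop_size=30, generations=40, seed=0):
    """Phase 1 (sketch): a plain generational GA over bit masks."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda m: fitness(m, rows, labels), reverse=True)
        parents = pop[: pop_size // 2]           # truncation selection keeps the best half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)   # one-point crossover
            child = a[:cut] + b[cut:]
            flip = rng.randrange(n_features)     # one bit-flip mutation per child
            child[flip] = 1 - child[flip]
            children.append(child)
        pop = parents + children
    best = max(pop, key=lambda m: fitness(m, rows, labels))
    return [j for j, bit in enumerate(best) if bit]

def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iterations=10):
    """Phase 2 (sketch): k-means with farthest-first initial centres."""
    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(far))
    assign = [0] * len(points)
    for _ in range(iterations):
        assign = [min(range(k), key=lambda c: sq_dist(pt, centers[c])) for pt in points]
        for c in range(k):
            members = [pt for pt, a in zip(points, assign) if a == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign

def make_row(i, f0, f3):
    # Toy individual: features 0 and 3 carry signal, features 1, 2 and 4 do not.
    return [f0, i % 2, 0.5, f3, (i % 3) / 2]

# Two hypothetical subgroups of affected individuals, each driven by a different factor.
affected = [make_row(i, 1.0, 0.0) for i in range(3)] + [make_row(i, 0.0, 1.0) for i in range(3, 6)]
healthy = [make_row(i, 0.0, 0.0) for i in range(6)]
rows, labels = affected + healthy, [1] * 6 + [0] * 6

selected = select_features(rows, labels, n_features=5)
clusters = kmeans([[r[j] for j in selected] for r in affected], k=2)
print("selected features:", selected)
print("cluster of each affected individual:", clusters)
```

The split into two functions mirrors the two phases: feature selection sees all individuals, while clustering only sees the affected ones, restricted to the features retained in the first phase.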

Related articles

A perspective on the role of genetic and environmental factors in the development of asthma

Background and purpose: Asthma is a chronic inflammatory disease of the airways that is caused by hypersensitivity to environmental allergens. Symptoms of asthma include shortness of breath, airway hyper-responsiveness, wheezing, and cough. The disease may range from mild to severe and from intermittent to chronic. Asthma is known as a multifactorial disease due to the interaction of gene...

Data Mining for Genetics: A Genetic Algorithm Approach

Mining biological data is an emerging area of intersection between data mining and bioinformatics. Bio-informaticians have been working on the research and development of computational methodologies and tools for expanding the use of biological, medical, behavioral, or health-related data. Biological data mining aims to extract significant information from DNA, RNA and proteins. Many biological...

Feature Selection in Data-Mining for Genetics Using Genetic Algorithm

We discovered genetic features and environmental factors which were involved in multifactorial diseases. To exploit the massive data obtained from the experiments conducted at the General Hospital, Chennai, data mining tools were required and we proposed a 2-Phase approach using a specific genetic algorithm. This heuristic approach had been chosen as the number of features to consider was large...

Structural analysis of impacting factors of sustainable development in underground coal mining using DEMATEL method

Mining can become more sustainable by developing and integrating economic, environmental, and social components. Among the mining industries, coal mining requires serious attention to the aspects of sustainable development. Therefore, in this work, we investigate the impacting factors involved in the sustainable development of underground coal mining from the structural viewpoint. For ...

Data Mining in Genome Wide Association Studies

The genetic basis for some human diseases, in which one or a few genome regions increase the probability of acquiring the disease, is fairly well understood. For example, the risk for cystic fibrosis is linked to particular genomic regions. Identifying the genetic basis of more common diseases such as diabetes has proven to be more difficult, because many genome regions apparently are involved,...

Journal:
  • Knowl.-Based Syst.

Volume 15  Issue 

Pages  -

Publication date 2002